Executive Summary


Government programs are characterized by complex implementation challenges on the 
ground, from the geographic scale to interaction with millions of residents within a multi-layered
ecosystem. Further, they must straddle the multiple goals of universality, ubiquity,
inclusiveness, uniformity and conformity. The Unique Identification Authority of India (UIDAI)
is working to provide residents of India a Unique Identification number (called Aadhaar). The
Aadhaar project, in a short span, is set to become the largest biometric capture and identification
project in the world. Managing these complexities in growing and maintaining large 
ecosystems has required UIDAI to leverage innovations across technology and operations. A 
strategic initiative adopted by UIDAI from the design stage has been the extensive usage of 
Analytics and Reporting to aid operations. 

Reporting is the process of sharing data related to an organization with key stakeholders. 
Analytics is the structured process of analyzing this data to derive insights that help operations. 
UIDAI’s experience, as well as emerging academic research, indicates that Analytics and 
Reporting delivers concrete benefits to the end-to-end operations. 

These benefits span tactical, operational and strategic levels, helping move decision making 
from "intuition based” to "data based”: 

1. Creating information conduits providing End-to-End integrated visibility for management 
across the entire ecosystem; 

2. Having a common language derived from a single source of truth (data) helps the
entire ecosystem communicate and coordinate; 

3. Real-time feedback loops to enable continual fine-tuning of operations; 

4. Increasing transparency of the system, internally and externally; 

5. Improvement in delivery of services, reducing leakages and delivering them to the
right beneficiaries.

Government programs stand to further gain by the usage of Analytics and Reporting. With the 
move to digitization in many programs, huge amounts of data are getting generated. Having a 
clear Analytics and Reporting strategy in place can ensure this data is harnessed and used to 
improve operations and delivery of services. Further, Analytics can be used at a strategic level to 
shape and execute public policy priorities in resident facing applications. 

At UIDAI, Analytics and Reporting has been a constituent of the UIDAI implementation strategy 
from inception. BI and Reporting modules have been part of the IT architecture design and a
dedicated BI and Reporting team has been in place. A cross-functional Analytics and Continuous
Improvement council was created at an early stage to suggest and oversee usage of Analytics 
across the organization. 






Creating an Analytics and Reporting function involves recognizing that building IT systems 
is not the end point; rather it is the starting point in terms of generating data. Success of this 
function depends on recognizing that data is the platform from which multiple decisions are 
enabled and not hardware or software. A key starting point in creation of the function is to 
ensure Business Intelligence (BI) systems are part of the overall IT architecture design and 
strategy. This is because Analytics and Reporting spans not just publishing reports, 
but includes data capture, integration and management; as well as analysis of data using 
dedicated tools and experts. 

The function can comprise three broad parts:

1. Business Intelligence framework that captures and manages data. It provides
the extensible infrastructure platform, framework and associated tools for Analytics. The
long-term reporting requirements need to be separated from operational (in-process)
requirements by providing a different database to handle each.

2. Delivery platforms like Email/FTP/Portal/Mobile create the information 
conduits through which data is shared with the organization in various 
formats (Excel/PowerPoint/Charting). An online analytics delivery platform 
is recommended. 

3. Delivery team would enable the function. A structure that is used in the Analytics 
and Reporting function is a combination of "End-user” (process owners who are 
the actual consumers of analytics) focused teams along with "support teams” that 
have been built for specific competencies. Two broad set of support teams focused 
around Technology and Specialized Analytics can be envisioned. 

Certain design principles, including a focus on stakeholder requirements, flexibility, scalability
and self-service, can help ensure the outputs are relevant and useful for the organization.
A core team can be built in-house that is supplemented by external vendors providing specific 
skill sets. The Analytics and Reporting team, core and extended, should ideally be co-located. 
Ensuring the right mix of technical (Warehouse, Analytics software) and business skill sets 
(Analysis, Insights) is important in building the team. A dedicated portal development team 
from the Technology vendor is required to run and maintain a web-based Analytics 
delivery platform. 

Hardware requirements for Analytics and Reporting need to be addressed separately since this is
typically among the most compute-intensive sub-systems. It is recommended that the Data
Warehouse storage and processing hardware be created independently, rather than using live 
production systems for Reporting. Software requirements would include Business Intelligence 
tools, Visual Analytics software, advanced modelling software and Data security software. 
Processes and a governance structure should be put in place to ensure data privacy 
and security. 

Implementation of a strategic initiative of this nature is likely to face challenges. Some of these 
challenges are technological, but the more difficult ones are behavioural. Senior leadership 
support is critical to the successful implementation of this transition. Implementing this 
requires a carefully thought out, phased transition plan that is a combination of short-term and
long-term milestones.
Introduction


The Unique Identification Authority of India (UIDAI), attached to the Planning Commission of
the Government of India, is working to provide residents of India a Unique Identification
number (called Aadhaar) linked to the resident’s demographic and biometric information. The
project aims to create a platform that serves as an "identification infrastructure” for the 
delivery of public and private services to the residents of India. 

The project began issuing Aadhaar numbers in September 2009, and as of January 2012 had 
already issued more than 110 million Aadhaar numbers. More than 40 million additional 
enrolments have been completed in the field across almost all states. The project has involved
taking cutting edge technology and devices to each individual resident. This has been
accomplished in a short timeframe, adding further layers of operational complexity.
Reaching this scale of operations has required UIDAI to create and operate a large eco-system. 

The Aadhaar project is set to become the largest biometric capture and identification project in 
the world. In its implementation, the Aadhaar project has leveraged many innovations across 
technology and operations. An important strategic initiative adopted by UIDAI from the design 
stage has been the extensive usage of Analytics and Reporting. 

Analytics based on usage of macro-economic data and government published datasets (e.g.
Census) has been prevalent. However, this document particularly deals with Analytics on data
generated by organizations themselves to aid their own operations and strategies. This 
document describes specifically how UIDAI is benefiting by leveraging Analytics and 
Reporting in its end to end operations. 

With the move to digitization, almost all government programs are generating data in each of 
their transaction points. For example, creation of a Ration card or linking the family members of a
ration card leads to creation of operational data. Many public sector enterprises already have 
large datasets related to their operations. Having an Analytics and Reporting strategy in place 
ensures that this data is harnessed and used to enhance operations in an efficient manner. The
document shows how such a strategy is equally relevant to other Government programs and provides a basic
template and guiding principles to build an Analytics and Reporting function.
What is "Analytics and Reporting"?


Reporting is the process of sharing data related to an organization with key 
stakeholders. Analytics is the structured process of analyzing this data to derive 
insights that help operations. 

While there is a general understanding of "reports”, the analogy of a water source can be
considered to understand the full spectrum of Analytics and Reporting within organizations.
Data can be considered as the equivalent of water. There are a number of processes involved 
before the actual consumption of water and data. The journey begins with data, like water, 
being generated at multiple sources. These are then brought together into one central location. 
Data in this central source then needs to be cleaned, processed and brought to defined 
standards before it can be shared onward. Information is then delivered to end-consumers
through a series of conduits in the form of multiple delivery platforms. Analytics and Reporting
does not end at just delivering data to users. Analytics involves using this data to run
operations better.
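The journey described above (multiple sources, one central location, cleaning to a defined standard, then delivery) can be illustrated with a minimal sketch. The record layout, field names and cleaning rules below are assumptions made purely for illustration; they do not describe UIDAI's actual systems.

```python
from collections import defaultdict

# Hypothetical records from three kinds of sources; field names are illustrative.
system_rows = [{"centre": "C1", "enrolments": 120}, {"centre": "C2", "enrolments": 95}]
manual_rows = [{"centre": " c1 ", "enrolments": "30"}, {"centre": "C3", "enrolments": ""}]
partner_rows = [{"centre": "C2", "enrolments": 5}]


def clean(row):
    """Bring one raw record to a defined standard, or return None if unusable."""
    centre = str(row.get("centre", "")).strip().upper()
    raw_count = str(row.get("enrolments", "")).strip()
    if not centre or not raw_count.isdigit():
        return None
    return {"centre": centre, "enrolments": int(raw_count)}


def integrate(*sources):
    """Bring all sources together in one central list, keeping only clean records."""
    central = []
    for source in sources:
        for row in source:
            cleaned = clean(row)
            if cleaned is not None:
                central.append(cleaned)
    return central


def report(central):
    """Deliver a simple summary: total enrolments per centre."""
    totals = defaultdict(int)
    for row in central:
        totals[row["centre"]] += row["enrolments"]
    return dict(sorted(totals.items()))


summary = report(integrate(system_rows, manual_rows, partner_rows))
print(summary)
```

In this sketch the manually entered record with a missing count is dropped at the cleaning stage, so only standardized data reaches the report.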



Figure 1: Components of Analytics and Reporting (panels: Reporting, Analytics)


Reporting can be broken down into two broad areas - collecting / managing data within the 
system (processing) and sharing it with the organization (Reports). It is observed that 
"Reports” are strongly associated with the entire Reporting process, though they 
represent only the visible part of the process (output). 

Data is generated in an organization from multiple sources. Some data is captured 
automatically while some is manually generated or captured by third-party sources. Core 
operations of organizations, typically directly linked with their IT infrastructure, tend to 
automatically capture data. Processes that are not directly connected with the IT 
infrastructure, or that were not designed to capture data, generate data that needs to be
manually input (typically field level operations data done in offline/paper mode). As an
example, operations related data like number of enrolments, processing etc is captured directly
by UIDAI systems. Data like number of enrolment centres (different from enrolment stations) and
enrolment form number are either not captured or are fed manually into the system. Finally, data
could also be provided by sources beyond the organization - these could be the vendors 
of the organization or independent sources like Census, market surveys etc. 
In UIDAI’s case, the data provided by its certification agency (Sify) and delivery partners 
(India Post) is captured independently by them and shared in a defined format. 

Data integration and management is an important part of Reporting as well. This typically 
involves integrating the various sources of data together and creating a platform (data 
warehouse) which provides easy access to this data. Data is typically stored in databases.
However, as organizations, user numbers and data volumes grow, direct extraction from raw
databases does not scale. Tools hence need to be configured that help access this data in a
convenient, fast and secure manner. These activities within Reporting are typically more 
technology focused and involve specialized skill sets related to Data Warehouse and Business 
Intelligence. UIDAI uses HDFS (Hadoop Distributed File system), an advanced data storage 
system from which data is extracted and managed using specialized tools (called Business 
Intelligence tools). 

Sharing of processed data is the most visible part of Reporting. This is typically done through 
Excel, PowerPoint, Web-Portals, Dashboards (a screen with all key pieces of data at one place) 
etc. Depending on the organization's maturity, this sharing spans basic data dumps to 
interactive usage of data. In the case of UIDAI, there is a dedicated reporting portal with
standardized reports and dashboards available to users when they log in.

Analytics, on the other hand, is the structured process of analyzing this data to help improve 
operational decision making. In the absence of a structured process, most users tend to rely on
their "gut”, past experience or ad-hoc data analysis to make decisions. Analytics is
the process of doing this in a structured manner, using a combination of data (science), tools, 
skilled resources and operational experience. The analysis could be basic analysis of trends and 
charts using Excel / Graphs or sophisticated statistical analysis of large amounts of data. This 
analysis is used to identify trends or anomalous behaviour, understand the root cause of the 
issues and conduct forecasting. The insights from Analytics guide decision makers to gain a
deeper understanding of their operations. The Analytics team at UIDAI works closely with 
process teams to analyze data like productivity, biometric quality and help them 
improve their operations. 

While organizations tend to have "reports” available in some form, Analytics and Reporting 
involves having a dedicated team working closely with Technology (IT) and program 
stakeholders to provide them data and insights on a regular basis. Data here is the "hero” - the 
centrepiece that anchors decision making. Hence the function needs to be treated differently, 
and over time decoupled, from systems and infrastructure architecture considerations.



3 "Analytics and Reporting" should be an essential part of Government programs


The government runs public service programs that are huge in scale, having many moving parts 
and complex implementation on the ground. Further, several large scale programs have been 
launched in the past few years (e.g. MNREGS, SSA, NRHM etc) and more initiatives continue to 
be launched. Government programs are characterized by complexity and challenges on the 
ground; mainstreamed usage of "Analytics and Reporting” helps meet some of these. 

3.1. Nature of Government programs is extremely demanding 

Government programs at all levels are characterized by the scale they operate at. They must 
straddle the varied goals of being universal (anyone can have access), ubiquitous (anywhere 
access), inclusive (no bias), uniform (in line with national goals) and yet diverse (to be relevant 
to every region, community, geography, language, customs). 

They involve interacting with millions of residents regularly. This scale translates into billions 
of transaction points between residents and the government in the delivery of these 
public services. The delivery of these public services often involves operating in a 
multi-layered ecosystem within the government itself. This spans multiple ministries, central- 
state-district-local administrations, partners across public sectors and private sector firms and 
vendors. Further, any implementation within the government framework requires strong 
compliance with government norms and processes. These include financial norms as well as 
process norms and Service Level Agreements (SLAs) while minimizing leakages. 

The UIDAI implementation bears out all of these challenges. The UIDAI ecosystem today has
more than 60 Registrars; including State governments, multiple central ministries and public 
sector enterprises that coordinate enrolment in the field. They in turn employ in excess of 100 
private sector agencies that physically enrol residents in the field. In addition, partnership with 
banks for Financial Inclusion and multiple enterprises for Authentication (actual usage of the 
Aadhaar number on the field) is taking off. 

Managing at this scale required the creation and operation of a distributed ecosystem.
Historically, management has been vertically integrated: all the resources are owned by
the implementing agencies and, where they are not, coordination is by diktat. UIDAI’s model is
one where responsibility is distributed, but it is brought together through incentive alignment, 
contracts/MOUs and technical interfaces like APIs. 

3.2. Analytics and Reporting is helping meet some of these demands 

UIDAI’s experience as well as emerging academic research indicates that Analytics and 
Reporting delivers concrete benefits to organizations. Reporting and Analytics has been a key
constituent of the UIDAI implementation strategy from inception, with the BI and Reporting
modules being part of the IT architecture design. A cross-functional Analytics and Continuous
Improvement council was created at an early stage to suggest and oversee usage of Analytics 
across the organization. The technology partner for UIDAI was also required to provide a 
dedicated Business Intelligence and Reporting team to support UIDAI. 

UIDAI has been able to leverage this Analytics and Reporting support at a tactical, operational 
and strategic level. The Analytics and Reporting function, through the council, has provided the 
UIDAI ecosystem with tools to help meet some of the challenges of scale and ramp-up 
that UIDAI has taken on. 

1. Providing End-to-End Visibility 

UIDAI today collects a huge amount of "meta-data” at each possible juncture in the Aadhaar 
enrolment and processing stages. Meta-data refers to data that describes the process, and can 
be captured at each stage of the implementation. As an example, UIDAI captures the time spent 
by an operator on each screen of the enrolment process, number of attempts, transit time, 
packet status at each stage of processing, letter delivery status etc. This, for example, is used to 
identify operators whose productivity is low and give them individual feedback to help 
improve. This also helps identify the best operators and share best practices. 
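As a rough illustration of how such meta-data could be used, the sketch below compares each operator's median enrolment time with the median across operators. The operator ids, timings and the `slow_factor` threshold are hypothetical values chosen for the example, not UIDAI figures or rules.

```python
from statistics import median

# Hypothetical meta-data: minutes taken per enrolment, keyed by operator id.
enrolment_minutes = {
    "OP-001": [6.0, 7.5, 6.5, 7.0],
    "OP-002": [14.0, 15.5, 15.0],
    "OP-003": [8.0, 9.0, 8.5, 9.5],
}


def operator_medians(minutes_by_operator):
    """Median time per enrolment for every operator with at least one record."""
    return {op: median(times) for op, times in minutes_by_operator.items() if times}


def flag_operators(minutes_by_operator, slow_factor=1.5):
    """Return (operators needing feedback, fastest operator).

    An operator is flagged when their median time exceeds slow_factor times
    the median across operators. The factor is an assumed, tunable value.
    """
    per_operator = operator_medians(minutes_by_operator)
    overall = median(per_operator.values())
    slow = sorted(op for op, m in per_operator.items() if m > slow_factor * overall)
    best = min(per_operator, key=per_operator.get)
    return slow, best


needs_feedback, best_operator = flag_operators(enrolment_minutes)
print(needs_feedback, best_operator)
```

Medians are used rather than means so that a single unusually long enrolment does not by itself mark an operator as slow.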

Technology helps build a complete end-to-end process using various best of breed 
components, technologies, organizations. "Data virtualization” and Reporting provide an 
integrated end to end view of the entire process, using highly granular data which provides full 
management visibility, exception handling etc. Web-based interaction serves as information 
conduits within the entire ecosystem, thus smoothening communication and coordination 
between departments. The organization becomes strongly inter-connected, in line with 
organizations that are becoming "networked”. 

2. Providing a single source of truth 

Internal and external stakeholders across the UIDAI ecosystem have access to performance and 
operational data at a click of a button through a web enabled Analytics portal. UIDAI provides 
the common data source that becomes the single source of truth for the organization. 

This has led to some significant advantages. First, decision making moves from being intuition-based
to evidence-based. Second, since all decisions are being made using the single source of
truth, the chance of them conflicting is greatly reduced. Third, the same data source is
available to everyone, regardless of geographical location or organizational boundaries, and 
hence decisions are coordinated. Finally, future decisions are well coordinated with past ones, 
since they are based on the temporal evolution of the same source of truth. 

3. Managing through information 

Internal reviews, as well as reviews of UIDAI with external partners like Registrars and Enrolment
Agencies, are enabled through standardized review reports delivered online every fortnight.
Users also have the option to build custom datasets through a user interface for their own 
analysis with almost real-time data. Internally, the portal provides a unified view to the 
eco-system partners on the status of their operations. 



In a multi-layered ecosystem (and, in UIDAI's case, a distributed one), data begins to serve as
the common language. The ecosystem is 'snapped together' through technical APIs, other
interfaces, workflow tools, legal agreements and contracts, and other coordination
mechanisms. The data 'overlay', together with analytics and reporting, enables it to be managed as a
seamless system. Analytics and Reporting thus helps manage the dispersed responsibility that
is a characteristic of a distributed ecosystem. It facilitates communication and coordination; 
helps align incentives and track against the same goals. 

4. Real time Feedback 

The data centre is the nerve centre of UIDAI where each enrolment data packet is processed to 
generate Aadhaar numbers. The "Network Operations Centre” (NOC) of UIDAI in Bangalore is a 
control room with rows of wall-mounted screens showing near real time data of processing 
stages and hardware. Alerts are triggered to the respective process owners in case of build-ups 
or hardware breakdown. 
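A build-up alert of this kind can be sketched as a comparison of pending work at each stage against a limit, with the alert routed to the stage's owner. The stage names, counts, limits and owner names below are invented for illustration and are not taken from UIDAI's Network Operations Centre.

```python
# Hypothetical processing stages, pending-packet counts, limits and owners.
queue_depths = {"validation": 1200, "de-duplication": 48000, "letter printing": 300}
thresholds = {"validation": 5000, "de-duplication": 20000, "letter printing": 1000}
owners = {"validation": "validation-team", "de-duplication": "biometric-team"}


def build_up_alerts(depths, limits, stage_owners):
    """Return (owner, message) pairs for every stage whose backlog exceeds its limit."""
    alerts = []
    for stage, depth in depths.items():
        limit = limits.get(stage)
        if limit is not None and depth > limit:
            owner = stage_owners.get(stage, "unassigned")
            alerts.append((owner, f"{stage}: {depth} packets pending (limit {limit})"))
    return alerts


alerts = build_up_alerts(queue_depths, thresholds, owners)
for owner, message in alerts:
    print(f"ALERT to {owner}: {message}")
```

A real deployment would read these counts from live monitoring data and deliver alerts through a messaging channel; the sketch only shows the threshold check itself.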

Near real-time visibility of operational performance enables rapid feedback loops - a necessity 
to manage large and complex systems. Real time feedback also enables policy making to be 
‘data-based’ and tactical/operational moves to be continually fine-tuned. Course correction
happens faster, rather than being a post-mortem exercise. It also helps speed up transfer of
learning within the organization, which is typically a slow and people-dependent process.

5. Transparency: 

The UIDAI public website (http://uidai.gov.in) and the public portal 
(http://portal.uidai.gov.in) provide a platform to share data related to operations publicly. 
The public portal provides overall enrolment stats with high degree of click-through 
granularity. Users can track the real-time status of processing of their enrolment packet. 
Aggregated and anonymized datasets are available for download by external users to conduct 
their own analysis. Payments related information is accessed and delivered through a 
standardized monthly process. Many Right to Information (RTI) and parliamentary questions 
are answered using data pulled from the internal portal.

Analytics and Reporting greatly helps in increasing transparency within an organization by 
sharing real-time operational data across the system as well as providing End-to-End visibility 
to everyone. Many government programs are also proactively seeking to build public facing
"transparency portals” to share much more information related to their users. The
Analytics and Reporting infrastructure can help with some of these initiatives by providing the 
computational back-end (e.g. search capability for users) as well as some of the aggregated 
data. 

6. Improving delivery of services 

UIDAI used a team of specialists who analyzed huge amounts of data using statistical software 
to come up with rules for identifying potential fraud attempts by enrolment operators. Further,
Online Authentication provided by UIDAI will use a combination of biometric matching and
fraud modules to identify potential fraud. State Resident Data Hubs (SRDH) will provide
standards and APIs that will help public services clean their databases using Aadhaar as the
common key, as well as deliver benefits to the right people when data across departments
are combined. 
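The idea of cleaning and combining departmental databases on a common key can be sketched as follows. The `resident_id` field, the placeholder identifiers and the department datasets are hypothetical; this is not the SRDH API, only an illustration of de-duplicating and joining on a shared key.

```python
# Hypothetical departmental records; "resident_id" stands in for the common key.
pension_records = [
    {"resident_id": "R-100", "name": "Resident A"},
    {"resident_id": "R-101", "name": "Resident B"},
    {"resident_id": "R-100", "name": "Resident A (second entry)"},
    {"resident_id": None, "name": "Resident without key"},
]
ration_records = [
    {"resident_id": "R-100", "scheme": "ration"},
    {"resident_id": "R-102", "scheme": "ration"},
]


def clean_by_common_key(records, key="resident_id"):
    """Split records into unique entries, suspected duplicates and unkeyed entries."""
    unique, duplicates, unkeyed = {}, [], []
    for record in records:
        value = record.get(key)
        if not value:
            unkeyed.append(record)
        elif value in unique:
            duplicates.append(record)
        else:
            unique[value] = record
    return unique, duplicates, unkeyed


def combine(dept_a, dept_b):
    """Join two de-duplicated datasets on the keys they have in common."""
    return {k: {**dept_a[k], **dept_b[k]} for k in dept_a.keys() & dept_b.keys()}


pension_unique, pension_duplicates, pension_unkeyed = clean_by_common_key(pension_records)
ration_unique, _, _ = clean_by_common_key(ration_records)
combined = combine(pension_unique, ration_unique)
print(len(pension_duplicates), len(pension_unkeyed), combined)
```

Records flagged as duplicates or lacking a key would need review by the owning department rather than automatic deletion.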

The move to digitization can help improve delivery of public services by government programs. 
Combining this with Analytics at the back-end, offline or in real time, can improve the quality of 
this delivery by minimizing leakages at each of the billions of transaction points between 
residents and the Government. 

7. Continuous improvement: 

Poor biometric quality enrolments by Ravi, an operator with an Enrolment Agency, came down 
by 75% after a training session that used analytics to help identify specific areas of 
improvement for him. The Analytics and Reporting function analyzes and publishes data to aid
the quality improvement initiatives that UIDAI is undertaking. The combined efforts have led to an
almost consistent increase in the quality of biometric data being captured for UIDAI since
March 2011. It is an unusual case of a system where quality has improved with expanding scale
of operations!

The Analytics and Reporting function can support specific initiatives that various process 
owners in organizations undertake. The function, through a team of analysts, can provide 
support to help improve their operations through operational insights. 

3.3. Using Analytics and Reporting in government programs 

The Government and public sector, with its interaction with a billion plus residents and many
times that number of transactions, is a treasure trove of data. In fact, Government stands to
gain among the most by using data and analysis, not just to execute programs but also to better
execute public policy priorities in resident facing applications. For example, if we analyze public
grievances in a city and find that the biggest complaints are about water logging, appropriate
resources can be directed to focus on drainage. Geographic level analysis can help pin-point 
pain areas and help deliver services better. 
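The grievance example can be made concrete with a small sketch that counts complaints by category and then by location for the leading category. The wards, categories and counts are made-up sample data.

```python
from collections import Counter

# Hypothetical grievance log: (ward, category) pairs.
grievances = [
    ("Ward 4", "water logging"),
    ("Ward 4", "water logging"),
    ("Ward 7", "water logging"),
    ("Ward 4", "street lighting"),
    ("Ward 2", "garbage collection"),
    ("Ward 7", "garbage collection"),
    ("Ward 4", "water logging"),
]

# Which category draws the most complaints?
by_category = Counter(category for _, category in grievances)
top_category, top_count = by_category.most_common(1)[0]

# Where are complaints in that category concentrated?
hotspots = Counter(ward for ward, category in grievances if category == top_category)
top_ward, top_ward_count = hotspots.most_common(1)[0]

print(top_category, top_count, top_ward, top_ward_count)
```

The same two-step count (by issue, then by geography) is what lets resources be directed to a specific area rather than to the city as a whole.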

Government programs can further bolster operations by sharing relevant analysis datasets
(with application of suitable aggregation, anonymization and statistical disclosure limitation
procedures) with academia, industry and the public through a data portal. This external
support, in the form of "crowd sourcing” can provide insights and improvements that the 
programs may themselves miss. Modelling this around participative initiatives like data.gov 
along with an API based design can allow people to analyze and mash up data and services 
across a larger section of data and for new applications. 
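One common statistical disclosure limitation step is small-cell suppression: aggregated counts below a minimum size are withheld before publication. The sketch below assumes a threshold of 5 and invented district data; the appropriate threshold and procedure would be a policy decision for each program.

```python
from collections import Counter

# Hypothetical individual-level records; only the grouping field is used.
records = (
    [{"district": "D1"}] * 7
    + [{"district": "D2"}] * 3
    + [{"district": "D3"}] * 5
)


def publishable_counts(rows, group_field, min_cell_size=5):
    """Aggregate rows by group and suppress (None) any count below min_cell_size.

    min_cell_size is an assumed value for illustration only.
    """
    counts = Counter(row[group_field] for row in rows)
    return {
        group: (n if n >= min_cell_size else None)
        for group, n in sorted(counts.items())
    }


published = publishable_counts(records, "district")
print(published)
```

Only the aggregated and suppressed table would be shared through the data portal; the individual-level records stay inside the organization.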

It is evident that the benefits outlined in the previous section can accrue to any government 
program that integrates Analytics and Reporting. The operational complexity of such 
ecosystems can be addressed by providing tools that enable better communication and 
visibility into operations. Analytics and Reporting ensures that the ecosystem as a whole 
speaks a common language. Information flows smoothly across the entire ecosystem thus 
improving communication. This flow of information can be as close to real-time as possible,
enabling rapid feedback loops and continuous improvements. Friction within distributed
ecosystems further reduces when all SLAs are visible and can be tracked easily by all 
stakeholders. Analytics and Reporting can increasingly become a "soft infrastructure” by 
providing a reliable decision support system delivering the right data and 
insights at the right time in an easy to comprehend manner. 

4 Creating an Analytics and Reporting function


Most government programs today have an Information Technology strategy in place. Creating 
an Analytics and Reporting function involves recognizing that building IT systems is not the end 
point; rather it is the starting point in terms of generating data. Hence, while the function can be 
built independently, Business Intelligence modules need to be ingrained within the overall IT 
strategy of the organization to be able to function. 

Success of this function depends on recognizing that Data is the centrepiece here around which 
the function works, and not hardware or software as is the case in a system or infrastructure 
driven IT architecture. The design should treat data as the platform from which multiple 
decisions are enabled; rather than just a technology platform that enables access to data. 
Hence, there is a need to create an independent Analytics and Reporting platform whose 
functioning is eventually decoupled from the systems and infrastructure teams. Focusing on 
the 'what' of data/information becomes the purview of this team, with the infrastructure team 
focusing on the 'how' of it. 

The structure and functioning of the Analytics and Reporting function would depend on the 
objectives and deliverables that the organization expects from it. In most cases, the Analytics 
and Reporting function acts as a support function to various processes within the organization. 

The sections that follow lay out a generic structure and some guiding principles that can serve 
as a reference for government programs when setting up or enhancing their Analytics and 
Reporting function. These have been derived from the experience within UIDAI as well as some 
of the best practices observed in industry. Details of these with specific reference to UIDAI are 
provided in section 7. 

There are 3 broad components to the Analytics and Reporting function - the Business 
Intelligence framework that captures and manages data, the Delivery platform through 
which data is shared within the organization (and potentially outside as well) and the Delivery 
team that enables the function. The delivery platform, in the context of the Analytics and 
Reporting function, is an important component since adoption of analytics is strongly driven by 
accessibility of data to users and ease of analysis. BI systems often provide a delivery
platform, but its design and usage should be treated independently. Each of these
components is described below, along with an illustration of how UIDAI has applied them. 




4.1. Journey cycle of data 

The Analytics and Reporting function handles data, right from the stage of capture to the end 
analysis and insights it generates. The structure of the function and how it is enmeshed with the 
organization will thus evolve from the journey-cycle of data within an organization. 

The journey cycle of data within an organization can be understood by going back to the 
analogy of a water source referred to earlier. Creating this flow helps track each touch-point of 
the data, and consequently, what is required at each touch-point in the flow, who would control 
the flow at each stage, the delivery mechanism and security features. Figure 2 illustrates the 
same by laying out the basic data flow and the resulting Analytics structure for UIDAI. 



Figure 2: Data flow and Analytics structure in UIDAI 

At UIDAI, data generated at multiple sources would typically come to the CIDR (Central ID 
Repository), UIDAI's Data centre, through an online mechanism. There could be certain
exceptional sources, like the Contact centre or Resident consumer surveys, that will not feed into
the Data centre directly. Data is then processed in the Data Warehouse using Business
Intelligence tools and converted into forms that can be accessed and shared easily. The delivery 
mechanism for data in most cases is the portal. In some cases, the Analytics team will directly 
access the warehouse, work on the data and deliver the analysis results to end-users. Some of 
the End-users are shown in the diagram. 

4.2. Design principles 

The flow, structure and operations of the Analytics and Reporting function will keep evolving 
over time; it is thus important that some basic guiding principles are formulated against which 
new changes can be evaluated. Some design principles are provided below. The appendix 
provides examples of the same in the UIDAI context. 























































1. Stakeholder Focused: 

a. Broad base data across E2E process owners - The Analytics function
should have access to data across processes to be able to deliver
in highly interconnected and distributed ecosystems.

b. Directly relevant to the operations of End-users at all times - The Analytics 
function should not be conducting analysis that is cocooned in its
own complexity and relevance. The analysis and insights should have
laser-like focus on how they will improve operations. In this regard, 
having process owner specific focus and projects is helpful. 

c. Active involvement of stakeholders in defining the scope - Involvement 
of Stakeholders to define the scope of Analytics for their processes will help 
drive adoption. 

d. Making analysis consumable - Given the varied types of stakeholders with 
different exposure to Analytics, focus of all reporting and Analytics outputs 
should be to ensure ease of understanding, conciseness and ease of 
usage on the field. 

2. Facilitating Self-improvement: 

a. Provide Self-service ability: Making data and basic analysis as much self-service 
based as possible through tools and relevant metrics will help drive self-improvement. 

b. Awareness and Training: Conscious effort is required to ensure stakeholders 
are trained on how to use data and how it can benefit them. 

c. Feedback mechanism - Strong feedback mechanism on Reporting, 
especially at the initial stages, will ensure relevance and successful 
implementation. 

3. Flexible and Scalable: 

a. Handle "Big" data: Over time, data will increase exponentially, fuelled by 
data coming in from residents, vendors and partners. Big data refers to data that is 
many orders of magnitude larger than traditional data. The size and nature of such 
data can make traditional database methodologies and technologies inadequate. 
Hence, provision should be made to ensure the system can handle this. 

b. Automation and portal based delivery: Manual creation of dashboards, 
management reports etc typically entails large teams within Analytics functions. 
Automating these standardized reports and delivering them through 
a portal is recommended. 



c. Consistency of data: Ensuring consistency of data across multiple sources is 
necessary to maintain quality of data. The system should be able to accommodate the 
varied types of data getting generated across the ecosystem. 

d. Modular design: The ability to 'snap' together various operating components, 
through technical APIs, interfaces, workflow tools etc, is required to rapidly 
build scalable platforms, leveraging the best available options. 

4.3. Business Intelligence framework 

The Business Intelligence systems provide the extensible infrastructure platform, framework 
and associated tools to help meet goals of the Analytics and Reporting function. This is the 
system that captures and manages data. Consequently, the architecture would typically consist 
of the three broad sections of data acquisition, Data storage and Data distribution platform; all 
of which can be considered to be part of an over-arching Data Warehouse strategy. 

• Data acquisition would include all source systems (apps) that feed data into the 
organization. It is necessary to have the BI system integrated with all the processes to 
ensure capture and consistency of data across multiple sources. In UIDAI's case, these 
include areas like Enrolment, Authentication, Training, Resident Interactions, etc. 

• Data storage would be the repository that contains all data in its granular details. 
It is important to note that data storage for Reporting should be separated from the 
master data (data that is part of the live production systems) used by organizations. 
Keeping the data storage as a replica, and not the actual production system, 
ensures that reporting queries do not impact live production systems. 
Also, the replica can hold only specific pieces of information, so that 
more data and history can be stored and only the required information 
is available in Reporting databases. At UIDAI, no personally identifiable 
information is stored in these Data warehouses. 

• Data distribution platform provides access to data on UIDAI and all its 
constituents. This data is presented to the end users and general public in a timely 
manner, while still protecting privacy, confidentiality, and security. Data access 
frameworks are in place that lay out the rules by which data is distributed within 
the organization as well as shared in the public, partner and data portals. 
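
The separation between live production data and the reporting replica described above can be illustrated with a small sketch. The Python fragment below is only an illustration under stated assumptions: the field names (`reference_id`, `resident_name` and so on) and the allow-list are hypothetical and are not taken from UIDAI's actual schema. It shows the idea of copying only allow-listed, non-identifying fields into a reporting record.

```python
# Hypothetical allow-list of fields permitted in the reporting replica.
# These names are illustrative assumptions, not UIDAI's actual schema.
REPORTING_FIELDS = ("reference_id", "state", "registrar", "enrolment_date", "status")


def to_reporting_record(production_record: dict) -> dict:
    """Copy only allow-listed fields; everything else (including PII) is dropped."""
    return {
        field: production_record[field]
        for field in REPORTING_FIELDS
        if field in production_record
    }


production_record = {
    "reference_id": "REF-0001",
    "resident_name": "Example Resident",  # PII: must not reach the warehouse
    "date_of_birth": "1980-01-01",        # PII: must not reach the warehouse
    "state": "Karnataka",
    "registrar": "Registrar A",
    "enrolment_date": "2012-03-15",
    "status": "PROCESSED",
}

reporting_record = to_reporting_record(production_record)
print(reporting_record)
```

An allow-list is used here rather than a block-list so that any new field added to the production systems stays out of the reporting replica until it is explicitly approved.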

UIDAI employs highly scalable, n-tier, reliable and open technology components to meet the 
UID BI requirements. UID BI systems consist of an Atomic Data Warehouse holding 
granular level data; tools and applications for extraction of data from source systems 
into the Warehouse; a Data distribution platform that enables provisioning of data through 
various datasets/dim-sets; and an Analytical delivery platform delivering relevant metrics, 
dashboards, portals etc through subject area specific Data marts. UIDAI uses the open source 
Apache Hadoop file system to handle "big data" storage requirements. Further, UIDAI uses an 
open source BI tool, Pentaho, to manage all the data in the back-end (ETLs, Access) as well 
as to serve this data for distribution. 

The BI warehouse in UIDAI stores data only for those packets that have undergone processing. This 
serves as the key source for all long term reporting requirements of the organization. 



However, organizations also face the need to have a reporting system to address operational 
requirements. Operational data refers to "in-process" information. This is typically volatile 
(changes continuously) and is normally used for tactical needs. UIDAI maintains an 
Operational Data Store (ODS) that provides this information on an "end of day" basis. This 
provides "in-process" information like the number of packets at each stage of processing, and is 
used for Reporting as well as by the Network Operations Center (NOC) that tracks the status of the 
system in real time. This separation of the Operational and BI data helps UIDAI meet the short and 
long term Reporting requirements. Logical views of both of these data sources are provided 
in Section 7. 
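
The kind of "in-process" information held in an Operational Data Store can be sketched as follows. This is a minimal illustration, not UIDAI's implementation: the stage names and packet identifiers are hypothetical, since the document does not list the actual processing stages. The sketch counts how many packets sit at each stage at the time of an end-of-day snapshot.

```python
from collections import Counter

# Hypothetical processing stages; the actual stage names are not given in this document.
STAGES = ("UPLOADED", "VALIDATED", "DEDUPLICATION", "UID_GENERATED", "LETTER_DISPATCHED")


def end_of_day_snapshot(packet_stages: dict) -> dict:
    """Return the number of in-process packets at each stage, including empty stages."""
    counts = Counter(packet_stages.values())
    return {stage: counts.get(stage, 0) for stage in STAGES}


# Current stage of each packet, keyed by an anonymised packet reference.
packet_stages = {
    "PKT-1": "UPLOADED",
    "PKT-2": "VALIDATED",
    "PKT-3": "VALIDATED",
    "PKT-4": "DEDUPLICATION",
    "PKT-5": "UID_GENERATED",
}

snapshot = end_of_day_snapshot(packet_stages)
for stage, count in snapshot.items():
    print(f"{stage:<20}{count}")
```

Because such a snapshot is overwritten as packets move on, it suits tactical monitoring, while the long term history remains in the BI warehouse.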

In most organizations, the skill sets involved here (BI tools, Database specialists, Data 
Warehousing specialists) are housed within the IT function. However, they need to work 
closely, and often exclusively, with the Analytics and Reporting function to ensure it meets the 
end operational requirements. 

4.4. Delivery platform 

The delivery platform is the conduit through which data is shared with the organization. Data could be 
shared through basic offline methods like Email/FTP uploads; in various formats like 
Excel/PowerPoint/Flash dashboards; or through real time visualization and charts on a 
web-based portal. 

The basic offline methods are typically the easiest and quickest to implement and help address 
many of the ad-hoc requests. However, they are also resource intensive and less visual in 
nature. Hence, the usage of more advanced formats is recommended. Additionally, web-based 
and automated delivery of data ensures consistency and security of data, as well as 
minimization of human errors. Combinations of these are typically used by organizations. 

With increased penetration of Smartphones, integration of the delivery platform with Mobile, 
Email and other collaboration platforms will become important. Today, there is an increasing 
use of devices that help capture data at the transaction point itself. Analytics delivery platforms 
like these will further help the front-line take decisions on the field rather 
than incurring the delay of having to come back to the base location. 

Enabling web-based delivery of Analytics and Reporting will require having a dedicated 
Analytics web portal available for users that could be integrated with the larger public portal. 
The web portal would provide functions like summary views, multi-dimensional 
analytics, highlights and reports, among other features. The delivery platform should be able to 
interact with and incorporate new BI tools, Charting software and security protocols, and provide an 
easy user interface. A dedicated portal development team from the vendor will be required to 
run and maintain such a site. A regular maintenance team alone will not meet 
requirements. 

UIDAI has a dedicated web-based Analytics portal, along the lines described above. Details of 
UIDAI’s delivery platform are provided in section 7. Keeping in mind the lean structure of UIDAI 
and the large ecosystem partners, UIDAI has chosen to have the web-based portal as the key 
delivery platform for all the Reporting requirements. This is a combination of standardized 
reports (Figure 5 - Section 7) and dashboards (Figure 3 - Section 7), as well as interfaces to 
enable self-service analysis (Figure 4 - Section 7). 



4.5. Delivery team 

A structure that works well for the Analytics and Reporting function is a combination of end 
user focused teams supported by teams that have been built for specific competencies. While 
different programs may choose different structures, this structure ensures a combination of 
specialization as well as spread within the organization. 

1. End-user teams: The process owners across the End-to-End spectrum of operations 
are the actual consumers of analytics within an organization. They can 
be thought of as the equivalent of various vertical functions like Sales, Finance, Research, 
Supply Chain etc in large organizations, and of equivalent process functions in Government 
programs. Broad-basing Analytics and adoption of the Analytics and Reporting function across 
the organization cannot happen without focusing the deliverables around these process 
owners' requirements. 

The End-user Analytics teams would be focused teams front-ending with their specific 
stakeholders. These teams would work closely with the process owners to understand data and 
analytics requirements and work with support teams at the back-end to ensure these 
requirements are implemented. They would also handhold the process owners initially and 
guide them on how to use data and perform basic analysis themselves. Over time, the 
front-end members of the Analytics team become specialists in these Stakeholder operations. 
Section 7 provides details of the responsibilities of the team along with an example of how UIDAI 
is creating these teams. 

2. Support Track: These are the equivalent of the "horizontals" in large organizations. 
These are competency based shared services teams that would provide support across the 
end-user teams. Two broad sets of support teams can be envisioned - a Technology focused track and 
a specialized Analytics and Research track. These can be differentiated on the basis of being 
generic support groups across stakeholders versus support groups for niche requirements. 

The technology support track would be focused on working with the IT team to create and 
maintain the Business Intelligence and delivery platforms as mentioned in earlier sections. 
They would act as the bridge between the Technology function in the organization and the 
Analytics and Reporting team. The Specialized Analytics and Research support track would 
consist of teams serving niche requirements - creating visual analytics, providing custom 
analysis, Fraud detection, Forecasting, Consumer research etc. Details with examples of these 
support teams are provided in section 7. 

The skill sets required in such a delivery team are diverse and take time to build and mature. A 
core team could be built in-house, comprising key members of the support team along 
with the end-user teams. As the Analytics function builds up, a strategic decision will need to be 
taken on bringing in external vendors. 

The Analytics and Reporting function, core and extended, should be co-located. This team 
would be one of the few who can view the entire system simultaneously and also understand 
its interactions. Having them together will help create collaborations across processes, 
as well as ensure learning is managed and shared quickly across the organization. 

Performance of the Analytics team will largely depend on the type of talent chosen. Getting skilled 
resources who know the usage of Analytics tools, Excel, PowerPoint etc is important. However, 
getting the right people, who understand the operations and environment context, provide insights 
based on them, and can work with senior leaders and process owners, is just as important and often 
harder. 

4.6. Infrastructure 

The Hardware and Software required for setting up and running the Analytics and Reporting function are 
strongly dependent on the existing infrastructure, budgets and Analytics structure of the 
organization. 

Analytics and Reporting is typically among the most compute intensive sub-systems in the IT 
architecture, spanning memory, storage space, computing power etc. The Hardware must be 
capable of handling the increase in size of data over time. Keeping this in mind, care 
needs to be taken to ensure that the hardware procurement process budgets for future requirements. 
For example, at UIDAI, the Business Intelligence Hadoop cluster is the main repository of reporting and 
Analytics data. This will keep expanding exponentially as more data comes in from Enrolment, 
followed by Authentication and update of information. 

Analytics software, in the form of licenses, would need to be procured. This includes Business 
Intelligence tools to manage the data warehouse, Visual Analytics software to present data in an 
intuitive fashion, advanced modelling software to run statistical modelling and Data security 
software to ensure security of data being analyzed. The use of Cloud based technologies should be 
considered where feasible. A strategic decision would need to be made on the 
usage of open source software, keeping in mind the considerations of support, cost and flexibility. 
Section 7 lists these software requirements in detail. 

Specifications for Hardware and Software requirements for the program can be arrived at by a group 
of experts that will ensure customization as per the needs of the organization. 

4.7. Data security and privacy 

Data security and data privacy are both important requirements in any organization and processes 
should be put in place that ensure full compliance with the organization's guidelines at all times. 
There should be a governance structure to oversee the way data is stored, used and shared across the 
organization. Physical as well as Virtual security measures should be in place at each touch point for 
Data security. 

At UIDAI, the Analytics and Reporting council also oversees the data sharing and usage policy for the 
function. The UID Business Intelligence system is a "Zero Knowledge of Individual Resident" data 
warehouse. The Analytics and Reporting system does not hold or extract any data relating to resident 
demographics or biometrics. No Personal Identification Information (PII) of any resident is 
available from the UID BI System to any user. Anonymization of data is a key strategy in the UIDAI data 
design. This implies that all individual records are only accessed through a "Reference ID" and not 
their actual Enrolment ID or UID. Just like a roll number would represent an examination paper 
(with all details of a student masked out), the Reference ID would represent the record. 
Sub-systems within UIDAI processing communicate with each other using Reference IDs. This 
ensures complete anonymization and privacy of data. Similarly, the BI databases also do not 
contain any individual ID details but only contain the Reference ID information. This ensures 
that there is no way in which an individual's data can be compromised. 
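
The Reference ID idea can be sketched in code. The document does not describe how UIDAI generates Reference IDs, so the fragment below is purely an assumption for illustration: it derives a stable pseudonym from an ID with a keyed hash (HMAC-SHA256), which is one common way to replace an identifier without storing it. The key, the `REF-` prefix, the sample ID and the field names are all hypothetical.

```python
import hashlib
import hmac

# Assumption: a secret key held only by the subsystem that issues Reference IDs.
SECRET_KEY = b"replace-with-a-securely-stored-key"


def reference_id(enrolment_id: str, key: bytes = SECRET_KEY) -> str:
    """Derive a stable pseudonym for an ID using a keyed hash (HMAC-SHA256)."""
    digest = hmac.new(key, enrolment_id.encode("utf-8"), hashlib.sha256).hexdigest()
    # Truncated only to keep the example readable.
    return "REF-" + digest[:16]


def anonymise(record: dict) -> dict:
    """Replace the real ID with its Reference ID before the record leaves the source system."""
    cleaned = {k: v for k, v in record.items() if k != "enrolment_id"}
    cleaned["reference_id"] = reference_id(record["enrolment_id"])
    return cleaned


record = {"enrolment_id": "1234/56789/00001", "state": "Kerala", "status": "PROCESSED"}
anonymised = anonymise(record)
print(anonymised)
```

The same input always maps to the same Reference ID, so sub-systems can still join records, while the original ID cannot be recovered without the key. A lookup table of randomly issued IDs would be an equally valid alternative design.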

There are multiple layers of security within the UIDAI system. These range from physical security 
at the Data centres to data being encrypted (2048-bit encryption) at the source, with no 
decryption at any intermediary point. In fact, data is always encrypted when "at rest". Adequate 
firewalls across each sub-system and audited data access further add to the security features. 

4.8. Process Requirements 

Important as it is to set in place the organization structure, Hardware and Software, it 
is just as necessary to set in place various processes to ensure smooth functioning within the 
framework of the organization. These processes should span requesting data, accessing data, 
creating datasets, getting requirements, incorporating requirements, SLAs etc. 




5 Complexities and challenges in implementation 


Incorporating a strategic initiative of this nature is likely to face challenges during 
implementation. Recognizing these upfront will help ensure that the phased transition is more 
likely to be successful in the organization. Some of these challenges are technological, but the 
more difficult ones are inevitably behavioural. 

1. Technology: Typical challenges faced in implementation arise due to BI/Reporting not 
being part of the original design, or being improperly designed into the architecture of the IT 
systems. This leads to issues at the data capture stage (systems / formats not standardized), 
data management stage (Data Warehouse not designed for size of data / poor BI tools) or the data 
reporting stage (web/offline based sharing of data). 

2. Silos: Information hoarding within processes is a frequently encountered challenge in 
organizations. Even if data is captured and generated, it often gets amassed and remains in 
departmental "silos". 

3. Data inconsistencies: One key issue with a poorly designed BI system is that individual 
functions tend to generate and rely on their own data. This can lead to a lot of confusion and 
undermine the credibility of decisions that are made. 

4. Data Integration: Data integration across multiple sources, as also across vendors and 
partners is an issue that frequently takes a long time to address as organizations scale. 

5. Mindset change: The key challenge is not to adopt systems that can handle data, but to 
adopt the mindset of making decisions using data. The piece that takes the most time 
is the culture change of having data as a backdrop in reviews / discussions and decisions. 

6. Lack of senior level push: Such dramatic changes in mindset and culture need to be 
driven consistently, right from the top management. Very often, Analytics ends up being a 
one-off activity / niche support function and tends to get lost among many operational priorities. 

7. Lack of adoption by stakeholders: This typically arises due to a number of reasons, 
including lack of awareness of what information is available, lack of accessibility of data and 
lack of data relevant to them, all of which make sponsors lose interest. 

8. Lack of Analytics talent: It is important to view Analytics and Reporting as different from 
an IT setup. Not having the right kind of talent could lead to the work being very technology 
heavy or too niche to be relevant to actual on-ground operations. 




6 Phased approach to creating a sustainable Analytics function 


Creation of the Analytics and Reporting function requires a large change; not just in terms of 
creating a new function and staffing the same, but also how it is integrated with the existing 
government program. This requires a carefully thought out, phased transition plan that 
ensures complete integration and smooth functioning as the change progresses. The phased 
transition should be set against a combination of short-term and long-term milestones. 

The phases would be structured around gaining a detailed understanding of user requirements, 
setting in place the right team and infrastructure, delivering on these requirements and 
conducting awareness and training for continued usage. 

Since this is a transition phase, continuous feedback should be taken to ensure relevance is 
maintained and adoption of Analytics is driven across stakeholders. A combination of 
qualitative and quantitative feedback can be used for the same. 

Senior leadership support is critical to the successful implementation of this transition. A 
consistent push from the leadership, accompanied by usage by them on a regular basis, will 
lead to the percolation of the analytics mind-set, and usage of the system across the 
organization. 






7 Illustrations from UIDAI 


This section lays out in detail how UIDAI has implemented the Analytics function, in the context 
of the data flow, structure and principles that have been laid out in the document. This should 
help the reader understand the main document better by looking at it through the lens of an 
actual example. 

7.1. Setting the Objectives 

The objectives for UIDAI are as below, and have been created keeping in mind the large 
ecosystem, consisting primarily of external partners, and a very lean UIDAI structure. 

1. Drive data based decision making: The delivery of the Analytics function should be such 
that stakeholders can easily include data and insights in their operations on a regular basis. The 
function should be able to drive a feedback loop to the overall organization and specific 
processes to help them improve continuously. 

2. Empower self-improvement: The analytics function should be such that it helps 
stakeholders to improve by themselves. Tools, data and platforms should be created to 
help each stakeholder analyze their own performance and operational metrics themselves. 

3. Be scalable and flexible: Given the breakthrough nature of the project, UIDAI 
requirements will continuously evolve in ways that are difficult to predict. The analytics 
solution (people, technology, infrastructure etc) must be designed to be able to scale up 
alongside and handle new technologies and diverse data sources of huge size. 

7.2. Business Intelligence Architecture and Infrastructure 

Enrolment, Processing, Authentication, Updation etc generate large amounts of data. The 
architecture consists of: 

1. Atomic Data Warehouse consisting of atomic data obtained synchronously or 
asynchronously, stored in a custom designed data model for the UID BI Atomic Data 
Warehouse. This is time-variant, consolidated and minimally aggregated, so as to provide 
information for downstream needs such as data marts, Charting, sandbox etc. 

2. UID BI EAI consisting of tools and applications to provide for extraction of data from source 
systems into the UID BI Data Warehouse. 

3. UID BI Data distribution platform to enable provisioning of data (through various datasets, 
dim-sets) for all external and internal consumption inclusive of the Public portal of UIDAI. 

4. UID BI Analytical and reporting delivery platform consisting of tools and platform to deliver 
all relevant metrics, dashboards, portals, reports, action-response work-flows etc. 





5. UID BI Data marts consisting of subject area specific or other subsets specific data derived 
from the UID BI data warehouse through a process of aggregation, along with 
relevant dimensionality. 
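
How a Data mart can be derived from the atomic warehouse "through a process of aggregation, along with relevant dimensionality" can be sketched as below. This is an illustrative example only: the rows, field names and dimensions are hypothetical and do not reflect UIDAI's data model or its Pentaho configuration. The sketch counts atomic rows for each combination of chosen dimensions.

```python
from collections import defaultdict

# Hypothetical atomic rows: one per processed packet, keyed only by a Reference ID.
# Field names and values are illustrative assumptions, not UIDAI's data model.
atomic_rows = [
    {"reference_id": "REF-1", "date": "2012-03-15", "state": "Karnataka", "registrar": "Registrar A"},
    {"reference_id": "REF-2", "date": "2012-03-15", "state": "Karnataka", "registrar": "Registrar B"},
    {"reference_id": "REF-3", "date": "2012-03-15", "state": "Kerala", "registrar": "Registrar A"},
    {"reference_id": "REF-4", "date": "2012-03-16", "state": "Karnataka", "registrar": "Registrar A"},
]


def build_data_mart(rows, dimensions):
    """Count rows for each combination of the chosen dimensions."""
    mart = defaultdict(int)
    for row in rows:
        key = tuple(row[dimension] for dimension in dimensions)
        mart[key] += 1
    return dict(mart)


# Finer-grained mart (state and registrar) and a rolled-up mart (state only).
by_state_registrar = build_data_mart(atomic_rows, ("state", "registrar"))
by_state = build_data_mart(atomic_rows, ("state",))

print(by_state_registrar)
print(by_state)
```

Moving from the state-level mart to the state-and-registrar mart corresponds to a drill-down, and the reverse to a roll-up. Since the marts hold only counts per dimension, no individual record needs to be exposed to report users.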

UIDAI uses an open source tool called Pentaho to extract data and present it through an 
internal analytics portal. The logical view below (Figure 6) gives a snapshot of the same: 

Figure 6: Logical view of the BI Architecture 

Figure 7: Logical view of the ODS Architecture 


















7.3. Delivery platform 

At UIDAI, a dedicated Analytics Portal delivers most of the data and analysis to Stakeholders. 

1. An exclusive Analytics page serves as the landing page for users. The landing page consists 
of dashboards providing All India figures on key Aadhaar processes, with the ability to drill 
down to a geography / Registrar level. 

2. The portal is a combination of standardized reports and dashboards, as well as interfaces to 
enable self-service analysis. 

3. The portal is able to handle multiple user types and provides customized data accordingly. 

4. The portal provides a separate interface that enables self-service analytics capability for 
users. Users have the option to choose specific dimensions, like Time, Geography and 
Registrars; and create custom data sets with a wide range of metrics provided. 

Figure 3: Dashboard for Internal Analytics portal 

Figure 4: Internal self-service portal 
(ability to choose metrics, time periods etc to create custom datasets) 

Figure 5: Sample of an actual downloadable (Excel) performance report 


7.4. Delivery team 

1. Stakeholder Track: 

The responsibility of this dedicated Stakeholder Track team will be to: 

1. Collect detailed data and analytics requirements from each of the stakeholders; 

2. Work with the back-end and front-end to ensure these requirements are implemented; 

3. Handhold these functions initially and guide them on how to use data and perform basic 
level of analysis themselves; 

4. Provide advanced analysis support using tools and software, which the end-stakeholders 
will not be able to do by themselves; 

5. Provide data where possible through the self-service options; 

6. Train the Stakeholder resources on data pulls and analysis. 

Within UIDAI, an initial list of stakeholders has been created. These have been grouped into 5 
different buckets as shown in Table 1 below. This grouping would be a feature even in other 
programs that are beginning Analytics functions since work has not matured for individual 
stakeholders. The grouping has been done so that the Analytics function representatives can 
serve these individual groups of stakeholders together. 


























The initial grouping can be done based on: 

1. Similarity of information requirements among the stakeholders within the Group; 

2. Balance of the data and Analytics load across groups; 

3. Ease of sequencing in phase-wise implementation of the Analytics. 


Stakeholder Tracks 

Group 1        Group 2                Group 3          Group 4           Group 5 
RO             Chairman / DG office   Authentication   Data Centre       Resident satisfaction 
Registrar/EA   RoB / Processes        Quality          Letter Delivery   IEC 
Processes      Finance                                 Call centre       Public 
Training                              Integration 

Table 1: Grouping for Stakeholder tracks for UIDAI 


2. Support Track 

The support tracks will consist of two broad sets of sub-tracks. 

1. Technology Support Tracks 

This is the Support Track that would provide support that spans most of the processes. 

i. Data design 

There is significant work involved before a metric is ready to be published to 
stakeholders, especially when the metric is being generated for the first time. This 
involves extracting samples of data, working with many of the back-end Technology teams 
and end-stakeholders to ensure that the data is being recorded correctly at the data 
centre, extracted correctly by the queries and defined in line with requirements in the 
back-end. In addition, this team would work with the BI, Database and Application teams 
in the technology function to define / refine back-end data structures, datamarts and 
data models. 

ii. Delivery platform 

This will be a dedicated team to drive the development and maintenance of the 
full-fledged analytics delivery platform. It will help drive automation of reporting and 
data pulls. 















2. Specialized analytics and research Support Tracks: 

i. Visual Analytics and Custom Analytics - A large part of Analytics, currently and for some 
time to come, will involve reporting: providing basic information on what 
is happening in operations. Most of the action steps that stakeholders derive on a regular 
basis will be from this reporting. The focus of this team would be to bring data 
to life, handle ad-hoc data requests and custom analysis requests, create 
management dashboards etc. 

ii. Fraud Detection, Risk Modelling, and Audit support - Identifying areas where fraud is 
happening, and modelling to identify which residents / areas are most at risk of fraud. 
These could relate to Enrolment as well as Authentication fraud. 

iii. Forecasting / Statistical modelling - Predictive modelling, optimization, scenario 
building towards forecasting. 

iv. Resident and Media Research - Resident survey and research and Media Analytics. 
Integrating call centre data and feedback will be a key role they play. 

7.5. Software requirements 

1. Business Intelligence tools 

a. To access back-end data and provide all kinds of basic data slice and dice capability. 
They could be operating on the File systems at the back-end as also be linked to the 
front-end where they can enable self service. 

b. The basic slice and dice would include multidimensional analysis: drill-down, 
drill-through, roll-up, sort, group, filter and calculation 

2. Visualization Analytics software 

a. This is included in what is today termed BI 3.0. This includes charting 
software that can bring data to life in real time to help make data intuitive to 
understand. 

b. The Analytics market today contains many open-source and paid software packages that can 
take large amounts of raw data and convert them into appealing and intuitive charts. 
These charting tools go beyond the standard charting capabilities provided by 
traditional software, and can also incorporate intelligence within their charting, thus 
enabling exception reporting etc. 

c. These could include abilities in Data animation and Mobile compatibility 

3. Analytics software 

a. This is the software that will be used for all the advanced modelling like 
forecasting, segmentation etc. These include commercial packages as well as open 
source based ones. 

b. These will typically be used by the Data Analysts and Managers and will require 
prior knowledge of their usage. 




4. Data security software 

A key principle, as stated before, is to enable a secure environment where data 
can be accessed, analyzed and shared. Apart from creating a secure VPN 
connection, appropriate software should be procured that can help transfer data, secure data 
so that it is not shared further, provide access restrictions etc. These have to be in line 
with the organization's Data security policy. 

7.6. Guiding principles for design 

Some principles are laid out as below, with relevant examples from UIDAI shared where 
applicable. 

1. Broad base data across E2E process owners - Today, a private sector Enrolment agency 
on the field can directly view the status of each of their packets in the UIDAI data centre and the 
delivery status with India Post in one single portal. This helps stakeholders and process owners 
to talk to each other in the same terms and help co-ordinate decision making across functions. 

2. Directly relevant to the operations of End-users at all times - Today, there are separate 
reports individually customized for Enrolment agency, Registrar, UIDAI, Financial Inclusion 
team etc within the same portal. 

3. Active involvement of stakeholders in defining the scope: The customized reports for 
EA, Registrars, Financial Inclusion team etc have been made after detailed inputs and trials 
with the stakeholders. 

4. Provide Self-service ability: UIDAI provides a "self-service" portal that helps 
users click through a variety of metrics and create custom reports. 

5. Ease of understanding and usage - The reports presented by UIDAI are in formatted PDFs 
for ease of printing on the field as well as downloadable in Excel to aid analysis. Charts are 
provided at many places with ability to download data quickly from them. 

6. Awareness and Training: New additions in terms of reports or functionalities are 
communicated to the field and partners regularly, through presentations and training sessions. 

7. Feedback mechanism - The standardized reports that have been created have been done 
so after multiple iterations with stakeholders and ensuring they are relevant. 

8. Ability to handle "Big" data: UIDAI has used the most cutting edge data storage protocols 
as well as hardware. The architecture and the datamarts / databases like ODS have been 
designed keeping in mind the huge amounts of data they will eventually store. 

9. Automation and portal based delivery: UIDAI reporting team ensures that once a data 
requirement is standardized, the data pull for that is automated and uploaded automatically 
on the portal. 

10. Consistency of data: There has been a conscious standardization to ensure all data, right 
from upload to delivery, is fed into the central UIDAI data store and shared through that. 





Unique Identification Authority of India 

Planning Commission, Government of India 





